Abstract

Since the early days of digital communication, Hidden Markov Models (HMMs) have been routinely used in speech recognition, natural language processing, image processing, and bioinformatics. An HMM $(X_i, Y_i)_{i \ge 1}$ assumes observations $X_1, X_2, \ldots$ to be conditionally independent given an "explanatory" Markov process $Y_1, Y_2, \ldots$, which itself is not observed; moreover, the conditional distribution of $X_i$ depends solely on $Y_i$. Central to the theory and applications of HMMs is the Viterbi algorithm, which finds a maximum a posteriori estimate $q_{1:n} = (q_1, q_2, \ldots, q_n)$ of $Y_{1:n}$ given the observed data $x_{1:n}$. Maximum a posteriori paths are also called Viterbi paths or alignments. Recently, attempts have been made to study the behavior of Viterbi alignments of HMMs with two hidden states when $n$ tends to infinity. It has indeed been shown that in some special cases a well-defined limiting Viterbi alignment exists. While innovative, these attempts have relied on rather strong assumptions. This work proves the existence of infinite Viterbi alignments for virtually any HMM with two hidden states.
1 Introduction

We consider hidden Markov models (HMMs) $(Y, X)$ with two hidden states. Namely, $Y$ represents the hidden process $Y_1, Y_2, \ldots$, which is an irreducible aperiodic Markov chain with state space $S = \{a, b\}$. In particular, the transition probabilities $P = (p_{lm})$, $l, m \in S$, are positive and the stationary distribution $\pi = \pi P$ is unique. For technical convenience, $Y_1$ is assumed to follow $\pi$; however, the results of the paper hold for arbitrary initial distributions. To every state $l \in S$ there corresponds an emission distribution $P_l$ on $\mathcal{X} = \mathbb{R}^d$. Given a realization $y_{1:\infty} \in S^{\infty}$ of $Y$, the observations $x_{1:\infty} := x_1, x_2, \ldots$ are generated as follows. If $y_i = a$ (resp. $y_i = b$), then $x_i$ is distributed according to $P_a$ (resp. $P_b$) and independently of everything else. We refer to this model as the (general) 2-state HMM.
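The generative description above can be made concrete with a short simulation. The following is only a sketch under stated assumptions: the function name `simulate_two_state_hmm`, the callables `emit_a` and `emit_b` standing in for $P_a$ and $P_b$, and the Gaussian emissions in the usage lines are all hypothetical choices made for illustration. The chain is started from the stationary distribution, for which $\pi_a = p_{ba}/(p_{ab} + p_{ba})$.

```python
import random


def simulate_two_state_hmm(n, p_aa, p_bb, emit_a, emit_b, seed=None):
    """Draw (y_1..y_n, x_1..x_n) from a 2-state HMM started in stationarity.

    p_aa, p_bb     : self-transition probabilities (p_ab = 1 - p_aa, p_ba = 1 - p_bb)
    emit_a, emit_b : hypothetical callables rng -> observation, standing in for P_a, P_b
    """
    rng = random.Random(seed)
    p_ab, p_ba = 1.0 - p_aa, 1.0 - p_bb
    pi_a = p_ba / (p_ab + p_ba)  # stationary probability of state a
    ys, xs = [], []
    state = "a" if rng.random() < pi_a else "b"
    for _ in range(n):
        ys.append(state)
        # The observation depends on the current hidden state only.
        xs.append(emit_a(rng) if state == "a" else emit_b(rng))
        stay = p_aa if state == "a" else p_bb
        if rng.random() >= stay:
            state = "b" if state == "a" else "a"
    return ys, xs


# Illustrative usage with Gaussian emissions (assumed values, not from the text).
ys, xs = simulate_two_state_hmm(
    1000, p_aa=0.9, p_bb=0.8,
    emit_a=lambda rng: rng.gauss(0.0, 1.0),
    emit_b=lambda rng: rng.gauss(2.0, 1.0),
    seed=0,
)
```

Only `xs` would be available to an observer; `ys` is the hidden path that the Viterbi alignment attempts to recover.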

In (Cappé et al., 2005), HMMs are called 'one of the most successful statistical modelling ideas that have [emerged] in the last forty years'. Since their classical application to digital communication in the 1960s (see further references in (Cappé et al., 2005)), HMMs have had a defining impact on the mainstream research in speech recognition (Huang et al., 1990, Jelinek, 1976, 2001, McDermott and Hazen, 2004, Ney et al., 1994, Padmanabhan and Picheny, 2002, Rabiner and Juang, 1993, Rabiner et al., 1986, Shu et al., 2003, Steinbiss et al., 1995, Ström et al., 1999), natural language models (Ji and Bilmes, 2006, Och and Ney, 2000), and more recently computational biology (Durbin et al., 1998, Eddy, 2004, Krogh, 1998, Lomsadze et al., 2005). Thus, for example, DNA regions can be labeled as $a$, 'coding', or $b$, 'non-coding', with $P_a$ and $P_b$ representing the respective distributions on the $\{A, C, G, T\}$ alphabet.

Given observations $x_{1:n} := x_1, \ldots, x_n$, and treating the hidden states $y_{1:n} := y_1, \ldots, y_n$ as parameters, inference in HMMs typically involves $v(x_{1:n})$, a maximum a posteriori (MAP) estimate of $Y_{1:n}$. It has now been recognized that '[in] spite of the theoretical and practical importance of the MAP path estimator, very little is known about its properties' (Caliebe, 2006). The same estimates are also known as Viterbi, or forced, alignments and can be efficiently computed by a dynamic programming algorithm also bearing the name of Viterbi. When substituted for the true $y_{1:n}$ in the likelihood function $\Lambda(y_{1:n}; x_{1:n}, \psi)$, Viterbi alignments can also be used to estimate $\psi$, any unknown free parameters of the model. Starting with an initial guess $\psi^{(0)}$ and alternating between maximization of the likelihood $\Lambda(y_{1:n}; x_{1:n}, \psi)$ in $y_{1:n}$ and in $\psi$ is at the core of Viterbi training (VT), or extraction (Jelinek, 1976), also known as segmental K-means (Ephraim and Merhav, 2002, Juang and Rabiner, 1990). Resulting estimates $\hat\psi_{VT}(x_{1:n}, \psi^{(0)})$ are known to be different from the maximum likelihood (ML) estimates $\hat\psi_{ML}(x_{1:n}, \psi^{(0)})$, which in this case are most commonly delivered by the EM procedure (Baum and Petrie, 1966, Bilmes, 199?, Ephraim and Merhav, 2002). Even if $\psi$ were known, Viterbi alignments $v(x_{1:n}; \psi)$ would typically differ from true paths $y_{1:n}$, and the long-run properties of $v(x_{1:n}; \psi)$ are not obvious (Caliebe, 2006, Caliebe and Rösler, 2002, Koloydenko et al., 2007, Lember and Koloydenko, 2007, 2008). Furthermore, (Koloydenko et al., 2007, Lember and Koloydenko, 2007, 2008) propose a hybrid of VT and EM which takes into account the asymptotic discrepancy between $\hat\psi_{ML}(x_{1:n}, \psi^{(0)})$ and $\hat\psi_{VT}(x_{1:n}, \psi^{(0)})$ in order to increase computational and statistical efficiencies of estimation of $\psi$ for $n$ large. Thus or otherwise, an important question is how to find the asymptotic properties of Viterbi alignments, given that the $(n+1)$-th observation can in principle change the previous alignment entirely, i.e. $v(x_{1:n+1})_i \ne v(x_{1:n})_i$, $1 \le i \le n$? Do the Viterbi alignments then admit well-defined extensions? We answer this question positively in (Lember and Koloydenko, 2008) for general HMMs (in particular, allowing more than two hidden states) by constructing proper infinite Viterbi alignments. Generalizing and clarifying related results of (Caliebe, 2006, Caliebe and Rösler, 2002), the approach in (Lember and Koloydenko, 2008) is to extend alignments piecewise, separating individual pieces by nodes (see §2 below). Although the construction is natural, a detailed formal proof of its correctness for general HMMs is rather long and requires certain mild technical assumptions. This paper, on the other hand, shows that in the special case of two-state HMMs, the existence of infinite Viterbi alignments needs no special assumptions and can be proven considerably more easily. The results of this paper essentially complete and generalize those of (Caliebe, 2006, Caliebe and Rösler, 2002).



2 Preliminaries 



Let $\lambda$ be a suitable $\sigma$-finite reference measure on $\mathbb{R}^d$ so that $P_a$ and $P_b$ have densities with respect to $\lambda$. For example, $\lambda$ can be the Lebesgue measure or, as in the case of discrete observations, a counting measure. Thus, let $f_a$ and $f_b$ be the densities of $P_a$ and $P_b$, respectively. Throughout the rest of the paper, we assume that $P_a \ne P_b$ or, equivalently,

$$\lambda(\{x \in \mathcal{X} : f_a(x) \ne f_b(x)\}) > 0. \tag{1}$$

Assumption (1) is natural since there would be no need to model the observation process by an HMM should the emission distributions coincide. Note also that, unlike in the general case, the positivity of the transition probabilities is also a natural assumption for the two-state HMMs. No further assumptions on the HMM are made in this paper. In particular, unlike (Caliebe, 2006, Caliebe and Rösler, 2002), we do not assume the square integrability of $\log(f_a/f_b)$, or equality of the supports of $P_a$ and $P_b$. However, the latter condition is not very restrictive, since for the two-state HMMs with unequal supports the existence of infinite Viterbi alignments follows rather trivially (Corollary 2.1 below).


Thus, for any $n \ge 1$ and any $x_{1:n} \in \mathcal{X}^n$ and $y_{1:n} \in S^n$, the likelihood $\Lambda_\pi(y_{1:n}; x_{1:n})$ is given by

$$\Lambda_\pi(y_{1:n}; x_{1:n}) = P(Y_{1:n} = y_{1:n}) \prod_{i=1}^{n} f_{y_i}(x_i), \quad \text{where } P(Y_{1:n} = y_{1:n}) = \pi_{y_1} \prod_{i=2}^{n} p_{y_{i-1} y_i}.$$

Since estimation of $\psi$ is not a goal of this paper, the dependence on $\psi$ is suppressed. Decomposition (2) and recursion (3) below provide a basis for the Viterbi algorithm to compute alignments. Namely, for all $u \in \{1, 2, \ldots, n-1\}$,

$$\max_{y_{1:n}} \Lambda_\pi(y_{1:n}; x_{1:n}) = \max_{l \in S} \Big[ \delta_u(l) \times \max_{y_{u+1:n}} \Lambda_{(p_{l\cdot})}(y_{u+1:n}; x_{u+1:n}) \Big], \tag{2}$$

where $(p_{l\cdot})$ is the transition distribution given state $l \in S$, and the scores

$$\delta_u(l) := \max_{y_{1:u-1}} \Lambda_\pi((y_{1:u-1}, l); x_{1:u}), \quad l = a, b,$$

are defined for all $u \ge 1$ and $x_{1:u} \in \mathcal{X}^u$. Thus, $\delta_u(l)$ is the maximum of the likelihood of the paths terminating at $u$ in state $l$. Note that $\delta_1(l) = \pi_l f_l(x_1)$ and $\delta_u(l)$ depends on $x_{1:u}$. The scores satisfy the recursion

$$\delta_{u+1}(a) = \max\{\delta_u(a) p_{aa}, \delta_u(b) p_{ba}\} f_a(x_{u+1}), \tag{3}$$
$$\delta_{u+1}(b) = \max\{\delta_u(a) p_{ab}, \delta_u(b) p_{bb}\} f_b(x_{u+1}), \quad u \ge 1.$$
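Recursion (3), followed by backtracking, translates almost line by line into code. The sketch below is illustrative only and is not the authors' implementation: it works with logarithms of the scores to avoid numerical underflow (a choice not made in the text), breaks ties in favor of $a$, and the dictionary-based arguments `pi`, `P`, `f` as well as the toy discrete densities are assumptions introduced for the example.

```python
import math


def viterbi_two_state(xs, pi, P, f):
    """Return one Viterbi alignment v(x_{1:n}) and the final log-scores.

    pi : dict state -> initial probability
    P  : dict (l, m) -> transition probability p_lm
    f  : dict state -> density function f_l
    """
    S = ("a", "b")
    log = lambda t: math.log(t) if t > 0 else -math.inf
    # delta_1(l) = pi_l * f_l(x_1), stored on the log scale.
    delta = {l: log(pi[l]) + log(f[l](xs[0])) for l in S}
    back = []
    for x in xs[1:]:
        new, ptr = {}, {}
        for m in S:
            # Recursion (3): pick the best predecessor l of state m.
            best = max(S, key=lambda l: delta[l] + log(P[(l, m)]))
            ptr[m] = best
            new[m] = delta[best] + log(P[(best, m)]) + log(f[m](x))
        delta = new
        back.append(ptr)
    # Backtrack from the best terminal state.
    state = max(S, key=lambda l: delta[l])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path, delta


# Toy model with observations in {0, 1} (assumed values, not from the text).
pi = {"a": 0.6, "b": 0.4}
P = {("a", "a"): 0.8, ("a", "b"): 0.2, ("b", "a"): 0.3, ("b", "b"): 0.7}
f = {"a": lambda x: 0.9 if x == 0 else 0.1,
     "b": lambda x: 0.2 if x == 0 else 0.8}
path, log_scores = viterbi_two_state([0, 0, 1, 1], pi, P, f)
```

Since `max` returns the first maximizer, ties between predecessors are resolved in favor of `"a"`; other tie-breaking rules may yield different, equally likely alignments.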



Example 2.1 Let $X_1, X_2, \ldots$ be i.i.d. following a mixture distribution $\pi_a P_a + \pi_b P_b$ with density $\pi_a f_a(x; \theta_a) + \pi_b f_b(x; \theta_b)$ and mixing weights $\pi_a, \pi_b > 0$. Such a sequence is an HMM with the transition probabilities $\pi_a = p_{aa} = p_{ba}$, $\pi_b = p_{bb} = p_{ab}$. In this special case the alignment is easy to exhibit. Indeed, in this case recursion (3) writes for any $u \ge 1$ as

$$\delta_{u+1}(a) = c\, \pi_a f_a(x_{u+1}), \quad \delta_{u+1}(b) = c\, \pi_b f_b(x_{u+1}), \tag{4}$$

where $c = \max\{\delta_u(a), \delta_u(b)\}$. Hence, the alignment $v(x_{1:n})$ can be obtained pointwise as follows:

$$v(x_{1:n}) = (v(x_1), \ldots, v(x_n)), \quad \text{where } v(x) = \arg\max\{\pi_a f_a(x), \pi_b f_b(x)\}.$$

Equivalently (ignoring possible ties), using a generalized Voronoi partition $\mathcal{X} = \mathcal{X}_a \cup \mathcal{X}_b$ with

$$\mathcal{X}_a = \{x \in \mathcal{X} : \pi_a f_a(x) \ge \pi_b f_b(x)\}, \quad \mathcal{X}_b = \{x \in \mathcal{X} : \pi_b f_b(x) > \pi_a f_a(x)\},$$

$v(x) = a$ if and only if $x \in \mathcal{X}_a$, and otherwise (i.e. $x \in \mathcal{X}_b$) $v(x) = b$.
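The pointwise rule of Example 2.1 needs no dynamic programming. A minimal sketch follows, assuming hypothetical names (`pointwise_alignment`, `f_a`, `f_b`) and toy discrete densities chosen purely for illustration; sending ties to $a$ is a convention of this sketch only.

```python
def pointwise_alignment(xs, pi_a, f_a, f_b):
    """Alignment of the mixture model: label x by the larger of pi_a*f_a(x), pi_b*f_b(x)."""
    pi_b = 1.0 - pi_a
    # Ties are sent to "a" here; the text ignores ties.
    return ["a" if pi_a * f_a(x) >= pi_b * f_b(x) else "b" for x in xs]


# Toy discrete densities on {0, 1} (assumed values, not from the text).
f_a = lambda x: 0.9 if x == 0 else 0.1
f_b = lambda x: 0.2 if x == 0 else 0.8
labels = pointwise_alignment([0, 1, 1, 0], pi_a=0.6, f_a=f_a, f_b=f_b)
```

Each label depends on its own observation only, which reflects the fact that in the mixture model every observation is a node.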

Generally, it follows from (3) that, if

$$\delta_u(a) p_{aa} > \delta_u(b) p_{ba}, \quad \delta_u(a) p_{ab} > \delta_u(b) p_{bb}, \tag{5}$$

for some $u \ge 1$ and some $x_{1:u} \in \mathcal{X}^u$, then for any $n > u$ and for any extension $x_{u+1:n} \in \mathcal{X}^{n-u}$, the Viterbi alignment goes through state $a$ at time $u$. Namely, truncation $v(x_{1:n})_{1:u}$ coincides with the Viterbi alignment $v(x_{1:u})$ (indeed, adding the two inequalities in (5) gives $\delta_u(a) > \delta_u(b)$). Thus, under condition (5), maximization of $\Lambda_\pi(y_{1:n}; x_{1:n})$ can be reset at time $u$ by clearing $x_{1:u}$ from the memory, retaining $v_{1:u}$, and replacing the initial distribution $\pi$ by $(p_{a\cdot})$ for further maximization of $\Lambda_{(p_{a\cdot})}(y_{u+1:n}; x_{u+1:n})$. Following (Lember and Koloydenko, 2008), if condition (5) holds, then $x_u$ is called a strong $a$-node (of realization $x_{1:n}$, $n \ge u$), where 'strong' refers to the inequalities in (5) being strict.
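Condition (5), together with its counterpart with $a$ and $b$ interchanged, can be checked directly from the scores. The following sketch lists the strong nodes of a finite realization; it is an illustration under assumptions, not code from the paper: scores are kept on the log scale, and the function name `strong_nodes`, the argument layout and the toy model are hypothetical.

```python
import math


def strong_nodes(xs, pi, P, f):
    """List (u, l), u counted from 1, such that x_u is a strong l-node as in (5)."""
    S = ("a", "b")
    log = lambda t: math.log(t) if t > 0 else -math.inf
    delta = {l: log(pi[l]) + log(f[l](xs[0])) for l in S}
    nodes = []
    for u, x in enumerate(xs, start=1):
        if u > 1:
            # Recursion (3) on the log scale.
            delta = {m: max(delta[l] + log(P[(l, m)]) for l in S) + log(f[m](x))
                     for m in S}
        for l, other in (("a", "b"), ("b", "a")):
            # Strict inequalities of (5) for both target states m.
            if all(delta[l] + log(P[(l, m)]) > delta[other] + log(P[(other, m)])
                   for m in S):
                nodes.append((u, l))
    return nodes


# Toy model with observations in {0, 1} (assumed values, not from the text).
pi = {"a": 0.6, "b": 0.4}
P = {("a", "a"): 0.8, ("a", "b"): 0.2, ("b", "a"): 0.3, ("b", "b"): 0.7}
f = {"a": lambda x: 0.9 if x == 0 else 0.1,
     "b": lambda x: 0.2 if x == 0 else 0.8}
nodes = strong_nodes([0, 0, 1, 1], pi, P, f)
```

In this toy realization the third observation is not a node, whereas the remaining three are; the alignment can be cut at any of the reported times without affecting what precedes them.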

Suppose $x_{1:\infty}$ contains infinitely many strong $a$-nodes at times $u_1 < u_2 < \ldots$. Let $v^1 = v(x_{1:u_1})$, and let $v^k$ maximize $\Lambda_{(p_{a\cdot})}(y_{u_{k-1}+1:u_k}; x_{u_{k-1}+1:u_k})$, for all $k \ge 2$. Then, concatenation $(v^1, v^2, v^3, \ldots)$ is naturally called the infinite piecewise Viterbi alignment (Lember and Koloydenko, 2008). Apparently, the almost sure existence of our infinite alignments directly depends on the existence of infinitely many (strong) nodes. At the same time, whether or not $x_u$ is a node depends on $x_{1:u}$ and hence is difficult to verify directly. Fortunately, in many cases $x_u$ is guaranteed to be a node based on several preceding observations $x_{u-m:u}$, $1 \le m < u$, ignoring the rest. Specifically, suppose for example that $x \in \mathcal{X}$ is such that

$$p_{ia} f_a(x) p_{aj} > p_{ib} f_b(x) p_{bj}, \quad \forall i, j \in S. \tag{6}$$



It is easy to check that for any $u \ge 2$, $x_u = x$ is a strong $a$-node for any $x_{1:u-1}$. Hence, if $x_{1:\infty}$ contains infinitely many observations satisfying (6), then $x_{1:\infty}$ also contains infinitely many strong nodes. This previous condition in its turn is met provided

$$\lambda(\{x \in \mathcal{X} : p_{ia} f_a(x) p_{aj} > p_{ib} f_b(x) p_{bj}, \ \forall i, j \in S\}) > 0. \tag{7}$$

Indeed, since our underlying Markov chain $Y$ is ergodic, it is rather easy to see that $X$ is ergodic as well (Ephraim and Merhav, 2002, Genon-Catalot et al., 2000, Leroux, 1992). Also, (7) implies that

$$P_a(\{x \in \mathcal{X} : p_{ia} f_a(x) p_{aj} > p_{ib} f_b(x) p_{bj}, \ \forall i, j \in S\}) > 0.$$

Thus, it follows from ergodicity of $X$ that almost every realization of $X$ has infinitely many elements satisfying (6) and, hence, infinitely many strong nodes. We have thus proved the following Lemma.

Lemma 2.1 Assume that (7) holds. Then almost every sequence of observations $x_{1:\infty}$ has infinitely many strong $a$-nodes.

(Clearly, interchanging $a$ and $b$ gives the same results in terms of $b$-nodes.) Lemma 2.1 is essentially Theorem 1 in (Caliebe and Rösler, 2002) (disregarding a misprint in the statement). Condition (7) holds for many two-state HMMs including the so-called additive Gaussian noise model (Caliebe, 2006), where the emission distributions are Gaussian. Another trivial example is the model with unequal supports of $P_a$ and $P_b$. Indeed, in that case (7) holds (at least up to swapping $a$ and $b$). Hence, the following Corollary.

Corollary 2.1 If the supports of $P_a$ and $P_b$ are not equal, then almost every sequence of observations has infinitely many strong nodes.

The goal of this work is essentially to remove condition (7) from Lemma 2.1.

To this end, following (Lember and Koloydenko, 2008), we call an observation satisfying (6) an $a$-barrier of length 1. More generally, a block of observations $z_{1:k} \in \mathcal{X}^k$ is called a (strong) barrier of length $k \ge 1$ if for every $m \ge 0$ and $x_{1:m} \in \mathcal{X}^m$, $z_{1:k}$ contains a (strong) node of realization $(x_{1:m}, z_{1:k})$. In (Lember and Koloydenko, 2008), we prove the existence of infinitely many barriers for a very general class of HMMs. For the two-state HMMs, the conditions of our result in (Lember and Koloydenko, 2008) are given by (8) and (9) below.

$$P_a(\{x \in \mathcal{X} : f_a(x) \max\{p_{aa}, p_{ba}\} > f_b(x) \max\{p_{bb}, p_{ab}\}\}) > 0 \quad \text{and} \tag{8}$$
$$P_b(\{x \in \mathcal{X} : f_b(x) \max\{p_{bb}, p_{ab}\} > f_a(x) \max\{p_{aa}, p_{ba}\}\}) > 0. \tag{9}$$

To achieve our goal, we will first prove the same result for the two-state HMM under the relaxed assumption that (8) or (9) holds. As we shall see below (Lemma 3.1), in our two-state HMM one of these conditions is automatically satisfied and, moreover, all barriers are strong. Hence, occurrence of infinitely many strong barriers in this case will be shown (Theorem 4.1) to require no additional assumptions.

Finally, if a node is not strong and $v(x_{1:n})$ is not unique, an alignment might exist that does not go through this node. Pathologies of this type cause technical inconveniences in defining an infinite Viterbi alignment and are treated in (Lember and Koloydenko, 2008). Fortunately, unlike in the general case, in the case of two-state HMMs almost every realization has infinitely many strong nodes (Theorem 4.1). This allows for a simple resolution of the non-uniqueness in the case of two-state HMMs.



3 Main results 

3.1 Three types of the two-state HMM 

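The following three cases exhaust all the possibilities:

1. $p_{aa} > p_{ba}$ ($\Leftrightarrow p_{bb} > p_{ab}$);

2. $p_{aa} < p_{ba}$ ($\Leftrightarrow p_{bb} < p_{ab}$);

3. $p_{aa} = p_{ba}$ ($\Leftrightarrow p_{bb} = p_{ab}$).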
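The three cases listed above depend only on how $p_{aa}$ compares with $p_{ba}$, so the classification is a one-line comparison. The helper below is a hypothetical illustration; the name `hmm_case` and the floating-point tolerance used to detect case 3 are assumptions of the sketch.

```python
import math


def hmm_case(p_aa, p_ba, tol=1e-12):
    """Return 1, 2 or 3 according to the sign of p_aa - p_ba."""
    if not (0.0 < p_aa < 1.0 and 0.0 < p_ba < 1.0):
        raise ValueError("transition probabilities must lie strictly between 0 and 1")
    if math.isclose(p_aa, p_ba, rel_tol=0.0, abs_tol=tol):
        return 3  # mixture model: both rows of the transition matrix coincide
    return 1 if p_aa > p_ba else 2
```

For instance, `hmm_case(0.9, 0.2)` falls in case 1 (a "sticky" chain), while `hmm_case(0.2, 0.9)` falls in case 2 (a chain that tends to alternate).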

From the definition of nodes, it follows that $x_u$ is not a node only in one of the following two cases:

(A) $\delta_u(a) p_{aa} > \delta_u(b) p_{ba}$ and $\delta_u(b) p_{bb} > \delta_u(a) p_{ab}$;

(B) $\delta_u(b) p_{ba} > \delta_u(a) p_{aa}$ and $\delta_u(a) p_{ab} > \delta_u(b) p_{bb}$.

Case (A) is equivalent to

$$\frac{p_{bb}}{p_{ab}} > \frac{\delta_u(a)}{\delta_u(b)} > \frac{p_{ba}}{p_{aa}}, \tag{10}$$

and case (B) is equivalent to

$$\frac{p_{bb}}{p_{ab}} < \frac{\delta_u(a)}{\delta_u(b)} < \frac{p_{ba}}{p_{aa}}. \tag{11}$$

Thus, in case (A), we have $\delta_{u+1}(a) = \delta_u(a) p_{aa} f_a(x_{u+1})$ and $\delta_{u+1}(b) = \delta_u(b) p_{bb} f_b(x_{u+1})$, so that for any $n > u$, the Viterbi alignment $v(x_{1:n})$ must satisfy $v(x_{1:n})_u = v(x_{1:n})_{u+1}$. Similarly, in case (B), $\delta_{u+1}(a) = \delta_u(b) p_{ba} f_a(x_{u+1})$ and $\delta_{u+1}(b) = \delta_u(a) p_{ab} f_b(x_{u+1})$, i.e. $v(x_{1:n})_u \ne v(x_{1:n})_{u+1}$. Evidently, case 1 and case (B) are mutually exclusive, and so are case 2 and case (A). Therefore, if the transition matrix satisfies the conditions of case 1, then $x_u$ is not a node if and only if conditions (A) are fulfilled. This implies that in case 1, nodes are the only possibility for $v(x_{1:n})$ to change state. On the other hand, if the transition matrix satisfies the conditions of case 2, then $x_u$ is not a node if and only if (B) holds. Hence, in case 2 nodes are the only possibility for $v(x_{1:n})$ to remain in one state. Case 3 corresponds to the mixture model (see Example 2.1 above). Apparently, by (4), every observation is a node in this case (see also Figure 1 below).

Let us now examine conditions (8) and (9). From equation (1), it follows that

$$\lambda(\{x \in \mathcal{X} : f_a(x) > f_b(x)\}) > 0, \quad \lambda(\{x \in \mathcal{X} : f_a(x) < f_b(x)\}) > 0 \tag{12}$$

and, for any $\alpha \ge \beta > 0$,

$$\lambda(\{x \in \mathcal{X} : \alpha f_a(x) > \beta f_b(x)\}) > 0 \ \Rightarrow \ P_a(\{x \in \mathcal{X} : \alpha f_a(x) > \beta f_b(x)\}) > 0, \tag{13}$$
$$\lambda(\{x \in \mathcal{X} : \alpha f_b(x) > \beta f_a(x)\}) > 0 \ \Rightarrow \ P_b(\{x \in \mathcal{X} : \alpha f_b(x) > \beta f_a(x)\}) > 0. \tag{14}$$

Therefore, we have the following Lemma.

Lemma 3.1 Any two-state HMM satisfies at least one of the conditions (8) and (9).

Proof. In case 1, (8) and (9) are equivalent to

$$P_a(\{x \in \mathcal{X} : f_a(x) p_{aa} > f_b(x) p_{bb}\}) = P_a\Big(\Big\{x \in \mathcal{X} : \frac{f_b(x) p_{bb}}{f_a(x) p_{aa}} < 1\Big\}\Big) > 0, \tag{15}$$
$$P_b(\{x \in \mathcal{X} : f_b(x) p_{bb} > f_a(x) p_{aa}\}) = P_b\Big(\Big\{x \in \mathcal{X} : \frac{f_a(x) p_{aa}}{f_b(x) p_{bb}} < 1\Big\}\Big) > 0, \tag{16}$$

respectively. If $p_{aa} = p_{bb}$, then (12) implies that both (15) and (16) are satisfied, and hence both (8) and (9) hold. If $p_{aa} > p_{bb}$, then (15), and subsequently (8), follow from (13). If $p_{aa} < p_{bb}$, then (16), and subsequently (9), follow from (14). Hence, at least one of the assumptions (8), (9) is always guaranteed to hold.

In case 2, (8) and (9) are equivalent to

$$P_a(\{x \in \mathcal{X} : f_a(x) p_{ba} > f_b(x) p_{ab}\}) = P_a\Big(\Big\{x \in \mathcal{X} : \frac{f_b(x) p_{ab}}{f_a(x) p_{ba}} < 1\Big\}\Big) > 0, \tag{17}$$
$$P_b(\{x \in \mathcal{X} : f_b(x) p_{ab} > f_a(x) p_{ba}\}) = P_b\Big(\Big\{x \in \mathcal{X} : \frac{f_a(x) p_{ba}}{f_b(x) p_{ab}} < 1\Big\}\Big) > 0, \tag{18}$$

respectively. Again, if $p_{aa} = p_{bb}$, then (17) and (18) both hold without further assumptions. If $p_{aa} > p_{bb}$, then (17) is automatically satisfied. Likewise, (18) holds if $p_{aa} < p_{bb}$. Hence, one of the assumptions (8), (9) is always guaranteed to hold.

In case 3, (8) and (9) write

$$P_a(\{x \in \mathcal{X} : f_a(x) \pi_a > f_b(x) \pi_b\}) > 0, \tag{19}$$
$$P_b(\{x \in \mathcal{X} : f_b(x) \pi_b > f_a(x) \pi_a\}) > 0. \tag{20}$$

Assume $\pi_a \ge \pi_b$. Then, (12) implies $\lambda(\{x \in \mathcal{X} : \pi_a f_a(x) > \pi_b f_b(x)\}) > 0$, which in turn implies (19). ■

Finally, we state and prove the main results for each of the three cases. 







Figure 1: Distinct patterns of the Viterbi alignment in the two-state HMM: 
Top: Case 1, state can possibly change only at nodes (larger circles). Middle: 
Case 2, states always alternate, except possibly at nodes. Bottom: Case 3, 
every observation is a node. 
3.2 Case 1

First, note that condition (7) in this case is equivalent to

$$\lambda(\{x \in \mathcal{X} : p_{ba} f_a(x) p_{ab} > p_{bb} f_b(x) p_{bb}\}) > 0. \tag{21}$$

As mentioned in §2, condition (7) need not hold in general. Nonetheless, for the two-state HMM, we have the following Lemma.

Lemma 3.2 In case 1, almost every realization of the two-state HMM has infinitely many strong barriers.

Proof. Without loss of generality, assume $p_{aa} \ge p_{bb}$. Then (15) holds, implying that there exists $\epsilon > 0$ such that

$$P_a(\mathcal{X}_a) > 0, \quad \text{where } \mathcal{X}_a := \Big\{x \in \mathcal{X} : \frac{f_b(x) p_{bb}}{f_a(x) p_{aa}} < 1 - \epsilon\Big\}.$$

Let integer $k$ be sufficiently large for $(1 - \epsilon)^k < p_{ab} p_{ba} / (p_{aa} p_{bb})$ to hold. Then every sequence $z_{1:k} \in \mathcal{X}_a^k$ satisfies

$$\prod_{j=1}^{k} \frac{f_b(z_j) p_{bb}}{f_a(z_j) p_{aa}} < (1 - \epsilon)^k < \frac{p_{ab} p_{ba}}{p_{aa} p_{bb}}. \tag{22}$$

Let $u > k$ be arbitrary and let $z_{0:k} \in \mathcal{X}_a^{k+1}$ be the last $k+1$ observations in a generic sequence $x_{1:u} \in \mathcal{X}^{u-k-1} \times \mathcal{X}_a^{k+1}$. To shorten the notation, we write $d_j(z_i)$ for $\delta_{u-k+i}(j)$ for every $i = 0, 1, \ldots, k$, $j = a, b$. Next, we show that $x_{u-k:u}$ contains at least one strong node, and consequently, $z_{0:k}$ is a strong barrier. Indeed, if none of the observations $x_{u-k:u}$ were a strong $a$-node then we would have

$$d_b(z_k) = d_b(z_0) \prod_{j=1}^{k} f_b(z_j) p_{bb}.$$

Similarly, if none among the observations $x_{u-k+1:u}$ were a strong $b$-node, we would have

$$\delta_u(a) \ge \delta_{u-k}(b) p_{ba} \Big(\prod_{j=1}^{k} f_a(z_j)\Big) p_{aa}^{k-1}.$$

Hence,

$$\frac{\delta_u(b)}{\delta_u(a)} \le \frac{\delta_{u-k}(b) p_{bb} \big(\prod_{j=1}^{k} f_b(z_j)\big) p_{bb}^{k-1}}{\delta_{u-k}(b) p_{ba} \big(\prod_{j=1}^{k} f_a(z_j)\big) p_{aa}^{k-1}} = \frac{\prod_{j=1}^{k} (f_b(z_j) p_{bb})}{\prod_{j=1}^{k} (f_a(z_j) p_{aa})} \cdot \frac{p_{aa}}{p_{ba}},$$

and by (22)

$$\frac{\delta_u(b)}{\delta_u(a)} < \frac{p_{ab}}{p_{bb}},$$

which contradicts (10). Thus, at least one of $x_{u-k:u}$ must be a strong node. Since $P_a(\mathcal{X}_a) > 0$, by ergodicity of the HMM, almost every realization has infinitely many barriers $z_{0:k} \in \mathcal{X}_a^{k+1}$, implying also that almost every realization has infinitely many strong nodes. ■
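The proof only requires some integer $k$ with $(1-\epsilon)^k < p_{ab} p_{ba}/(p_{aa} p_{bb})$; the resulting barrier $z_{0:k}$ then has length $k+1$. As a rough illustration of how long such barriers are, the sketch below computes the smallest admissible $k$. The function name `barrier_exponent` and the numerical values in the usage line are assumptions, not quantities from the text.

```python
def barrier_exponent(eps, p_aa, p_bb):
    """Smallest integer k >= 1 with (1 - eps)**k < p_ab * p_ba / (p_aa * p_bb)."""
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    p_ab, p_ba = 1.0 - p_aa, 1.0 - p_bb
    ratio = (p_ab * p_ba) / (p_aa * p_bb)  # below 1 in case 1
    if ratio <= 0.0:
        raise ValueError("transition probabilities must be positive")
    k, power = 1, 1.0 - eps
    while power >= ratio:
        k += 1
        power *= 1.0 - eps
    return k


# Illustrative values (assumed): eps = 0.5, p_aa = 0.9, p_bb = 0.8.
k = barrier_exponent(0.5, 0.9, 0.8)
```

The stickier the chain (the smaller the ratio), the longer the run of observations from $\mathcal{X}_a$ needed to force a node.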

The next Theorem refines the previous result. 

Theorem 3.1 Suppose the (transition matrix of the) two-state HMM meets the condition of case 1. If $p_{aa} \ge p_{bb}$, then almost every realization has infinitely many strong $a$-barriers. (If $p_{aa} \le p_{bb}$, then almost every realization has infinitely many strong $b$-barriers.)

Proof. Let $p_{aa} \ge p_{bb}$ and use the notation of the proof of Lemma 3.2. First, we show that none of the observations $x_{u-k+1:u}$ is a $b$-node. Indeed, since

$$d_b(z_1) = \max\{d_a(z_0) p_{ab}, d_b(z_0) p_{bb}\} f_b(z_1),$$

at least one of the following two inequalities must hold:

$$p_{ab} f_b(z_1) p_{ba} \ge p_{aa} f_a(z_1) p_{aa}, \quad p_{bb} f_b(z_1) p_{ba} \ge p_{ba} f_a(z_1) p_{aa} \tag{23}$$

in order for $x_{u-k+1}$ to be a $b$-node. However, (15) implies that $p_{ba} f_a(z_1) p_{aa} > p_{bb} f_b(z_1) p_{ba}$ and, since $p_{bb} > p_{ab}$, we have $p_{bb} f_b(z_1) p_{ba} > p_{ab} f_b(z_1) p_{ba}$. Hence, neither of the two inequalities (23) holds. Thus, $x_{u-k+1}$ cannot be a $b$-node, and the same argument shows that none of the subsequent observations $x_{u-k+2}, \ldots, x_u$ can be a $b$-node either.

The argument of the proof of Lemma 3.2 then shows that one of the observations in $x_{u-k:u}$ is a strong $a$-node and therefore $z_{0:k}$ is a strong $a$-barrier. The ergodic argument finishes the proof. (The same argument with $a$ and $b$ swapped establishes the second part of the Theorem.) ■

Note that the condition $p_{bb} \ge p_{aa}$ is sufficient but not necessary for (16) to hold. In fact, for many 2-state HMMs, such as the one with additive white Gaussian noise, both (15) and (16) hold for any (positive) values of $p_{aa}$ and $p_{bb}$. On the other hand, it might happen that one of the conditions (15) and (16), say (16), fails. This would mean $P_b(\{x \in \mathcal{X} : p_{bb} f_b(x) > p_{aa} f_a(x)\}) = 0$ or, equivalently,

$$\lambda(\{x \in \mathcal{X} : p_{bb} f_b(x) > p_{aa} f_a(x)\}) = 0. \tag{24}$$

Corollary 3.1 In case 1, equation (24) implies that almost every sequence of observations has infinitely many strong $a$-barriers and no strong $b$-nodes. Furthermore, equation (24) in case 1 implies that for almost every realization, if a $b$-node does occur, it occurs before the first $a$-node.



Proof. From the proof of Theorem 3.1, it follows that no observation $x \in \mathcal{X}$ such that $p_{bb} f_b(x) \le p_{aa} f_a(x)$ (i.e. from the complement of the set in (24)) can be a strong $b$-node; a closer inspection of the proof actually shows that even a weak (i.e. not strong) $b$-node cannot occur after an $a$-node (since in case 1 $p_{bb} > p_{ab}$). Theorem 3.1 then implies that almost every sequence of observations has infinitely many strong $a$-barriers. ■

Corollary 3.1 in its turn implies that starting with the first strong $a$-node onward, the Viterbi alignment $v(x_{1:n})$ stays in state $a$. As we have already mentioned, Viterbi alignments need not be unique (see (Lember and Koloydenko, 2008)), i.e. ties are possible in general, and in this case, in particular, they are possible up until the first strong $a$-node. However, the impossibility of strong $b$-nodes in this case implies that the ties can be broken in favor of $a$, resulting in the constant all-$a$ alignment.

Theorem 3.1 is a generalization of Theorem 7 in (Caliebe, 2006), which basically states that in case 1, if (15) and (16) hold then under some additional assumptions (equal supports of $P_a$ and $P_b$ and further conditions A2), almost every realization has infinitely many nodes. Thus, (Caliebe, 2006) stops short of realizing that in case 1 conditions (15) and (16) alone ensure the existence of $a$- and $b$-nodes. This results in (Caliebe, 2006) invoking Theorem 2 of (Caliebe and Rösler, 2002) to prove the existence of nodes, hence the superfluous assumptions A1, A2. Also the proof of Theorem 7 in (Caliebe and Rösler, 2002) could be simplified and shortened with the help of the notions of nodes and barriers. Finally, Corollary 3.1 generalizes Theorems 8 and 9 of (Caliebe, 2006).



3.3 Case 2 

Recall that we have been proving the existence of barriers without condition (7). Note that in case 2, condition (7) becomes

$$\lambda(\{x \in \mathcal{X} : p_{aa} f_a(x) p_{aa} > p_{ab} f_b(x) p_{ba}\}) > 0.$$

Recall (§2) also that interchanging $a$ with $b$ gives a similar condition for strong $b$-nodes to occur infinitely often in almost every realization.

It follows from (12) that for some $\epsilon > 0$, the sets

$$\mathcal{X}_a := \{x \in \mathcal{X} : f_a(x)(1 - \epsilon) > f_b(x)\}, \quad \mathcal{X}_b := \{x \in \mathcal{X} : f_a(x) < f_b(x)(1 - \epsilon)\}$$

both have positive $\lambda$-measure. Hence $P_a(\mathcal{X}_a) > 0$ and $P_b(\mathcal{X}_b) > 0$. Then, for $x_{1:2} \in \mathcal{X}_a \times \mathcal{X}_b$, the following holds:

$$\frac{f_b(x_1) f_a(x_2)}{f_a(x_1) f_b(x_2)} < (1 - \epsilon)^2. \tag{25}$$

Lemma 3.3 In case 2, almost every realization has infinitely many strong barriers.
Proof. Let $\mathcal{X}_a$ and $\mathcal{X}_b$ be as above. Choose $k$ sufficiently large for

$$(1 - \epsilon)^{2k} < \frac{p_{aa} p_{bb}}{p_{ba} p_{ab}}$$

to hold. Next, consider a sequence $z_{0:2k} \in \mathcal{X}^{2k+1}$, where $z_0, z_{2i} \in \mathcal{X}_a$, $z_{2i-1} \in \mathcal{X}_b$, for every $i = 1, \ldots, k$. We show that for every $u > 2k$, every sequence of observations $x_{1:u} \in \mathcal{X}^u$ such that $x_{u-2k:u} = z_{0:2k}$ contains a strong node, making $z_{0:2k}$ a strong barrier.

The choice of $k$ and $z_{0:2k}$ implies

$$\frac{\prod_{i=1}^{k} p_{ba} f_a(z_{2i-1}) p_{ab} f_b(z_{2i})}{\prod_{i=1}^{k} p_{ab} f_b(z_{2i-1}) p_{ba} f_a(z_{2i})} < (1 - \epsilon)^{2k} < \frac{p_{bb} p_{aa}}{p_{ba} p_{ab}}. \tag{26}$$

If there is no strong node among $x_{u-2k:u}$, then

$$d_b(z_{2k}) = d_b(z_0) \prod_{i=1}^{k} p_{ba} f_a(z_{2i-1}) p_{ab} f_b(z_{2i})$$

and

$$d_a(z_{2k}) \ge d_b(z_0) \frac{p_{bb}}{p_{ab}} \prod_{i=1}^{k} p_{ab} f_b(z_{2i-1}) p_{ba} f_a(z_{2i}).$$

Hence, by (26),

$$\frac{d_b(z_{2k})}{d_a(z_{2k})} \le \frac{\prod_{i=1}^{k} p_{ba} f_a(z_{2i-1}) p_{ab} f_b(z_{2i})}{\prod_{i=1}^{k} p_{ab} f_b(z_{2i-1}) p_{ba} f_a(z_{2i})} \cdot \frac{p_{ab}}{p_{bb}} < \frac{p_{aa}}{p_{ba}},$$

which contradicts (11).

Next, we refine this result. Without loss of generality assume $p_{ba} \ge p_{ab}$. Therefore

$$p_{ab} p_{aa} \ge p_{ba} p_{bb}, \tag{27}$$

and also, for every $x \in \mathcal{X}_a$,

$$p_{ba} f_a(x) > p_{ab} f_b(x). \tag{28}$$

Hence, (17) holds. We multiply the right side of (28) by $p_{ba} p_{bb}$ and the left side by $p_{ab} p_{aa}$, and use (27) to obtain

$$f_a(x) p_{aa} > f_b(x) p_{bb}. \tag{29}$$

Finally, for $x \in \mathcal{X}_b$, we have

$$f_a(x) < f_b(x). \tag{30}$$

We will need the following Lemma.
Lemma 3.4 Assume (in addition to being in case 2) that $p_{ab} \le p_{ba}$.

a) In any pair of observations $z_{1:2} \in \mathcal{X}_a \times \mathcal{X}_b$, $z_1$ is not a $b$-node.

b) In any pair of observations $z_{2:3} \in \mathcal{X}_b \times \mathcal{X}_a$, if $z_2$ is a $b$-node, then $z_3$ is a strong $a$-node.

Proof. Assume that $p_{ab} \le p_{ba}$, and consider a). Let $z_0$ denote the observation preceding $z_1$. First note that since we are in case 2, $z_1$ is a $b$-node if and only if

$$d_b(z_1) p_{bb} \ge d_a(z_1) p_{ab}. \tag{31}$$

Suppose first that $z_0$ is not a node, in which case $d_b(z_1) = d_a(z_0) p_{ab} f_b(z_1)$ and $d_a(z_1) = d_b(z_0) p_{ba} f_a(z_1)$. Then

$$d_a(z_1) p_{ab} = d_b(z_0) p_{ba} f_a(z_1) p_{ab} \ge d_a(z_0) p_{aa} f_a(z_1) p_{ab} > d_a(z_0) p_{bb} f_b(z_1) p_{ab} = d_a(z_0) p_{ab} f_b(z_1) p_{bb} = d_b(z_1) p_{bb}.$$

The first inequality above follows from the recursion property (3) of scores $\delta$, whereas the second one follows from (29). Thus, when $z_0$ is not a node, $z_1$ cannot be a $b$-node. Similarly, supposing that $z_0$ is an $a$-node, we obtain that $z_1$ is not a $b$-node. Suppose finally that $z_0$ is a $b$-node. Then $d_b(z_1) = d_b(z_0) p_{bb} f_b(z_1)$ and $d_a(z_1) = d_b(z_0) p_{ba} f_a(z_1)$. Applying consecutively $p_{bb} < p_{ab}$, (28) and $p_{bb} < p_{ab}$ again, we obtain: $p_{bb} f_b(z_1) p_{bb} < p_{ab} f_b(z_1) p_{bb} < p_{ba} f_a(z_1) p_{bb} < p_{ba} f_a(z_1) p_{ab}$. Thus, contrary to (31),

$$d_b(z_1) p_{bb} = d_b(z_0) p_{bb} f_b(z_1) p_{bb} < d_b(z_0) p_{ba} f_a(z_1) p_{ab} = d_a(z_1) p_{ab},$$

that is, $z_1$ is not a $b$-node in this case either. Let us now prove b). If $z_2$ is a $b$-node, then $d_a(z_3) = d_b(z_2) p_{ba} f_a(z_3)$ and $d_b(z_3) = d_b(z_2) p_{bb} f_b(z_3)$. By (29), we now have $d_a(z_3) p_{aa} = d_b(z_2) p_{ba} f_a(z_3) p_{aa} > d_b(z_2) p_{bb} f_b(z_3) p_{ba} = d_b(z_3) p_{ba}$. Similarly to the argument regarding $b$-nodes guaranteed by (31) above, we now have $d_a(z_3) > d_b(z_3)$, implying $d_a(z_3) p_{ab} > d_b(z_3) p_{bb}$. Thus $z_3$ is a strong $a$-node. ■

Theorem 3.2 If $p_{ba} \ge p_{ab}$, then almost every realization has infinitely many strong $a$-nodes. If $p_{ba} \le p_{ab}$, then almost every realization has infinitely many strong $b$-nodes.

Proof. Assume again that $p_{ba} \ge p_{ab}$. Let $z_{0:2k}$ be as in the proof of Lemma 3.3 and attach one more element $z_{2k+1} \in \mathcal{X}_b$ to the end. Thus, $z_{2i} \in \mathcal{X}_a$ and $z_{2i+1} \in \mathcal{X}_b$, $i = 0, 1, \ldots, k$.

From (the proof of) Lemma 3.3 we know that $z_{0:2k}$ contains at least one strong node. If this is an $a$-node, then the theorem is proven. Otherwise this is a $b$-node, which, according to part a) of Lemma 3.4, can only be among $z_1, z_3, \ldots, z_{2k-1}$. Applying part b) of Lemma 3.4 shows that there must also be a strong $a$-node among $z_2, z_4, \ldots, z_{2k}$. Invoking ergodicity again finishes the proof.

Clearly, swapping $a$ and $b$ in the above discussion following the proof of Lemma 3.3 establishes the other part of the theorem. ■

Inequality (28) guarantees (17). Often, the model is such that in addition to (17), (18) also holds. However, to apply the previous proof (i.e. of Theorem 3.2) to guarantee the simultaneous existence of infinitely many strong $a$- and $b$-nodes, we would need the following counterpart of (29): $P_b(\{x \in \mathcal{X} : f_b(x) p_{ab} > f_a(x) p_{ba}, \ f_b(x) p_{bb} > f_a(x) p_{aa}\}) > 0$, which is stronger than (18). However, this previous condition is indeed often met, resulting in infinitely many strong $a$- and $b$-nodes (in almost every realization $x_{1:\infty}$).

Lemma 3.3 appears without proof as Theorem 10 in (Caliebe, 2006). The author of (Caliebe, 2006) actually suggests that Theorem 10 and other results for case 2 are analogous to the corresponding results for case 1, mainly Theorem 7 (of the same work). It is further stated in (Caliebe, 2006) that the proofs of those results are not given as they "are very similar" to the corresponding proofs in case 1. Our present workings actually show that case 2 is quite dissimilar to case 1 (due to the fluctuating nature of the typical Viterbi alignment) and in particular requires a more careful treatment. Note that, even if Theorem 10 in (Caliebe, 2006) assumed (8) and (9) (as Theorem 7 in (Caliebe, 2006) does) to help one to prove this Theorem by analogy to Theorem 7, it is still not clear how the two proofs could be very similar.



3.3.1 Case 3 (the mixture model) 

Recall that every observation in this case is a (not necessarily strong) node. Furthermore, every observation from $\{x \in \mathcal{X} : \pi_a f_a(x) > \pi_b f_b(x)\}$ is a strong $a$-node. Thus, we have the following counterpart of Theorems 3.1 and 3.2.

Theorem 3.3 If $\pi_a \ge \pi_b$, then almost every realization has infinitely many strong $a$-nodes. If $\pi_a \le \pi_b$, then almost every realization has infinitely many strong $b$-nodes.



4 Conclusion 



In summary, we have proved Theorem 4.1, stated below, which provides a basis for the piecewise construction and asymptotic analysis of the Viterbi alignments of two-state HMMs.

Theorem 4.1 Almost every realization of the two-state HMM has infinitely many strong barriers. Furthermore,

a) if the transition probabilities satisfy $p_{aa} > p_{ba}$, then (almost every realization of) the chain has infinitely many strong $s$-barriers, where $s$ is such that $p_{ss} = \max\{p_{aa}, p_{bb}\}$;

b) otherwise (i.e. if $p_{aa} \le p_{ba}$), (almost every realization of) the chain has infinitely many strong $s$-barriers, where $s$ is such that $p_{ts} = \max\{p_{ab}, p_{ba}\}$ (for some $t \in S$).